Nearly Optimal Sample Size in Hypothesis Testing for High-Dimensional Regression

Abstract

We consider the problem of fitting the parameters of a high-dimensional linear regression model. In the regime where the number of parameters $p$ is comparable to or exceeds the sample size $n$, a successful approach uses an $\ell_1$-penalized least squares estimator, known as Lasso. Unfortunately, unlike for linear estimators (e.g., ordinary least squares), no well-established method exists to compute confidence intervals or p-values on the basis of the Lasso estimator. Very recently, a line of work \cite{javanmard2013hypothesis, confidenceJM, GBR-hypothesis} has addressed this problem by constructing a debiased version of the Lasso estimator. In this paper, we study this approach for random design model, under the assumption that a good estimator exists for the precision matrix of the design. Our analysis improves over the state of the art in that it establishes nearly optimal \emph{average} testing power if the sample size $n$ asymptotically dominates $s_0 (\log p)^2$, with $s_0$ being the sparsity level (number of non-zero coefficients). Earlier work obtains provable guarantees only for much larger sample size, namely it requires $n$ to asymptotically dominate $(s_0 \log p)^2$. In particular, for random designs with a sparse precision matrix we show that an estimator thereof having the required properties can be computed efficiently. Finally, we evaluate this approach on synthetic data and compare it with earlier proposals.
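The abstract does not spell out the debiasing step. In the cited line of work it is commonly written as $\hat\theta^u = \hat\theta + \frac{1}{n} M X^\top (y - X\hat\theta)$, where $\hat\theta$ is the Lasso estimate and $M$ is an estimate of the precision matrix of the design. The following is a minimal sketch under that assumption, not the authors' implementation. The function names, the ISTA solver, the choice of the regularization level, and the use of the identity as $M$ (which is the true precision matrix only for the isotropic Gaussian design simulated here) are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Approximately minimize (1/(2n))||y - X theta||^2 + lam * ||theta||_1
    by proximal gradient descent (ISTA), starting from zero."""
    n, p = X.shape
    # Step size 1/L, where L = ||X||_2^2 / n is the Lipschitz constant of the gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ theta) / n
        theta = soft_threshold(theta - step * grad, step * lam)
    return theta

def debiased_lasso(X, y, lam, M):
    """Return (theta_hat, theta_u): the Lasso estimate and its debiased version
    theta_u = theta_hat + (1/n) M X^T (y - X theta_hat).
    M is assumed to be a given estimate of the design's precision matrix."""
    n = X.shape[0]
    theta_hat = lasso_ista(X, y, lam)
    theta_u = theta_hat + M @ (X.T @ (y - X @ theta_hat)) / n
    return theta_hat, theta_u

# Synthetic example (all sizes are arbitrary illustrative choices).
rng = np.random.default_rng(0)
n, p, s0, sigma = 200, 300, 5, 0.5
X = rng.standard_normal((n, p))           # isotropic Gaussian random design
theta0 = np.zeros(p)
theta0[:s0] = 1.0                         # s0 non-zero coefficients
y = X @ theta0 + sigma * rng.standard_normal(n)

lam = 2.0 * sigma * np.sqrt(np.log(p) / n)  # a common heuristic scaling, assumed here
theta_hat, theta_u = debiased_lasso(X, y, lam, M=np.eye(p))
print("Lasso estimate, first coordinates:   ", np.round(theta_hat[:s0], 3))
print("Debiased estimate, first coordinates:", np.round(theta_u[:s0], 3))
```

The Lasso estimate is shrunk toward zero by the penalty; the correction term adds back a projection of the residual, which is what makes coordinate-wise confidence intervals and p-values tractable in the cited approach. With a non-isotropic design, `M` would have to be estimated rather than set to the identity.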

Bibliographic details

  • Year: 2013
  • Original format: PDF
  • Language: English
